Fix panic when pending pipelinerun is failed #4298

Merged
merged 1 commit into from Oct 12, 2021

@ghost ghost commented Oct 11, 2021

Changes

Fixes #4297

PipelineRuns that are created with status PipelineRunPending
can be placed into a failed state before their execution begins.
For example: a third-party controller may be watching for pending
PipelineRuns to perform some checks on them prior to execution
beginning. If those checks fail the controller might choose to
set the PipelineRun status to failed with a relevant error message
indicating which check failed and why.

Prior to this commit, when a pending PipelineRun failed, our metrics
code could panic because the PipelineRun's StartTime is nil.

This commit adds a guard to the metrics code to ensure that StartTime
is not nil before computing the PipelineRun's duration. If it is nil then
we assume the duration is 0. A unit test confirming this behaviour
has been added as well.
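
To illustrate the shape of the guard, here is a minimal, self-contained sketch. It is not the code in pkg/pipelinerunmetrics/metrics.go: `runStatus`, `runDuration` and the helper `at` are hypothetical names invented for this example, and plain `*time.Time` fields stand in for the real status timestamps. Treating a missing completion time as a 0 duration is also an assumption of the sketch.

```go
package main

import (
	"fmt"
	"time"
)

// runStatus is a hypothetical stand-in for the timestamps a PipelineRun
// status carries; it is not the Tekton API type.
type runStatus struct {
	StartTime      *time.Time // nil if execution never began
	CompletionTime *time.Time // nil if the run has not finished
}

// at is a small helper that returns a pointer to a fixed Unix time.
func at(sec int64) *time.Time {
	t := time.Unix(sec, 0)
	return &t
}

// runDuration returns how long the run took. A run that never started
// (nil StartTime) is reported as 0 instead of dereferencing a nil pointer.
// Returning 0 for a nil CompletionTime is an assumption of this sketch.
func runDuration(s runStatus) time.Duration {
	if s.StartTime == nil || s.CompletionTime == nil {
		return 0
	}
	return s.CompletionTime.Sub(*s.StartTime)
}

func main() {
	// A pending run that was failed before it started. For illustration we
	// assume only CompletionTime was set.
	failedWhilePending := runStatus{CompletionTime: at(160)}
	fmt.Println(runDuration(failedWhilePending)) // 0s

	// A run that started and finished 60 seconds later.
	normal := runStatus{StartTime: at(100), CompletionTime: at(160)}
	fmt.Println(runDuration(normal)) // 1m0s
}
```

Without the nil check, the `*s.StartTime` dereference in the first case would panic, which is the failure mode described above.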

/kind bug

Release Notes

Fixed an issue where the PipelineRun reconciler could panic if a PipelineRun with spec.status set to PipelineRunPending was placed into a failed state before execution was able to begin.

@tekton-robot tekton-robot added release-note Denotes a PR that will be considered when it comes time to generate release notes. kind/bug Categorizes issue or PR as related to a bug. labels Oct 11, 2021
@tekton-robot tekton-robot added the size/M Denotes a PR that changes 30-99 lines, ignoring generated files. label Oct 11, 2021
@tekton-robot (Collaborator)

The following is the coverage report on the affected files.
Say /test pull-tekton-pipeline-go-coverage to re-run this coverage report

| File | Old Coverage | New Coverage | Delta |
|---|---|---|---|
| pkg/pipelinerunmetrics/metrics.go | 81.6% | 82.0% | 0.4 |

@ghost (Author) commented Oct 11, 2021

/test pull-tekton-pipeline-alpha-integration-tests

@ghost (Author) commented Oct 11, 2021

/test pull-tekton-pipeline-alpha-integration-tests

@pritidesai (Member) left a comment

thanks @sbwsg for this fix 👍

@tekton-robot (Collaborator)

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: pritidesai

@tekton-robot tekton-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Oct 11, 2021
@pritidesai (Member)

read tcp 10.28.0.18:39812->104.18.122.25:443: read: connection reset by peer

/test pull-tekton-pipeline-alpha-integration-tests

@ghost (Author) commented Oct 12, 2021

/test pull-tekton-pipeline-alpha-integration-tests

@bobcatfish (Collaborator)

ooo yikes nice fix!

/lgtm

(side note: apparently we only have 2 test flake issues - I'm wondering if this PR is a fluke or if we're not recording the flakes that are occurring 🤔)

@tekton-robot tekton-robot added the lgtm Indicates that a PR is ready to be merged. label Oct 12, 2021
@pritidesai (Member)

> ooo yikes nice fix!
>
> /lgtm
>
> (side note: apparently we only have 2 test flake issues - I'm wondering if this PR is a fluke or if we're not recording the flakes that are occurring 🤔)

I have seen multiple flakes in the past few days but haven't recorded any 😞, including #4281 (comment), #4286 (comment), and many more.

@tekton-robot tekton-robot merged commit 0299c6c into tektoncd:main Oct 12, 2021
@pritidesai pritidesai added the needs-cherry-pick Indicates a PR needs to be cherry-pick to a release branch label Oct 13, 2021
@bobcatfish (Collaborator)

@pritidesai I wonder if it would be helpful if we made a template for creating flake bugs specifically, to make it easier to record them 🤔

@pritidesai (Member)

> @pritidesai I wonder if it would be helpful if we made a template for creating flake bugs specifically, to make it easier to record them 🤔

Sorry @bobcatfish for the delayed response, I just saw this.

Definitely, I think it would be useful to create such a template. I am not sure if it's possible, but when Prow reports an integration test failure, it would help if it could attach such a template so that the PR author can create a new issue.

@ghost ghost removed the needs-cherry-pick Indicates a PR needs to be cherry-pick to a release branch label Jan 26, 2022
Development

Successfully merging this pull request may close these issues.

PipelineRun reconciler panics when Pending PipelineRun is immediately placed into failed state.
3 participants